Rank | Count | Beginning |
---|---|---|
1454 | 619 | Dan |
4810 | 411 | Jagis |
8436 | 307 | Son |
2258 | 293 | Dat |
3569 | 191 | Gieldda |
9464 | 167 | Vári |
2111 | 94 | Dasa |
4708 | 90 | Jagi |
4182 | 74 | Guovllu |
7896 | 74 | Sámi |
8889 | 69 | Su |
3907 | 68 | Go |
9793 | 66 | Vuolde |
9878 | 51 | Vuosttaš |
1077 | 49 | Čoahkkebáikki |
2833 | 47 | Dušše |
354 | 41 | Arrondissemeantta |
2741 | 37 | Doppe |
3362 | 37 | Gávpoga |
1406 | 34 | Dalle |
6780 | 34 | Norgga |
8990 | 34 | Suoidnemánu |
2200 | 32 | Das |
6703 | 32 | Njukčamánu |
8273 | 31 | Skábmamánu |
9033 | 31 | Suoma |
1177 | 30 | Cuoŋománu |
7029 | 30 | Ođđajagimánu |
952 | 29 | Čakčamánu |
4268 | 29 | Guovvamánu |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N=1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
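For readers without database access, the same count can be sketched in Python. This is a minimal illustration, not the query used to produce the table: the function name `sentence_beginnings`, its parameters, and the sample sentences are assumptions made for this example. It imitates MySQL's `substring_index(sentence, ' ', n)` by splitting on single spaces, so it also covers the cases N=2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors the SQL query by keeping everything
    before the n-th space of each sentence (or the whole sentence if it
    contains fewer than n spaces).
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, as substring_index does with ' '.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like "order by cnt desc".
    return counts.most_common(limit)


# Placeholder sentences for illustration only, not taken from the corpus.
sample = ["Dan w1 w2", "Dan w3", "Son w4"]
print(sentence_beginnings(sample, n=1))
```

Unlike the SQL version, this sketch holds all counts in memory, which is usually fine for N=1 but may become costly for larger N on a big corpus.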